Orthogonal regularization



Expandable and Differentiable Dual Memories with Orthogonal Regularization for Exemplar-free Continual Learning

Moon, Hyung-Jun, Cho, Sung-Bae

arXiv.org Artificial Intelligence

Continual learning methods have traditionally forced neural networks to process sequential tasks in isolation, preventing them from leveraging useful inter-task relationships and causing them to repeatedly relearn similar features or overly differentiate them. To address this problem, we propose a fully differentiable, exemplar-free expandable method composed of two complementary memories: one learns common features that can be used across all tasks, and the other combines the shared features to learn discriminative characteristics unique to each sample. Both memories are differentiable so that the network can autonomously learn latent representations for each sample. For each task, the memory adjustment module adaptively prunes critical slots and minimally expands capacity to accommodate new concepts, and orthogonal regularization enforces geometric separation between preserved and newly learned memory components to prevent interference. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet show that the proposed method outperforms 14 state-of-the-art methods for class-incremental learning, achieving final accuracies of 55.13%, 37.24%, and 30.11%, respectively. Additional analysis confirms that, through effective integration and utilization of knowledge, the proposed method can increase average performance across sequential tasks, and it produces feature extraction results closest to the upper bound, thus establishing a new milestone in continual learning.
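The orthogonality idea in this abstract — keeping newly added memory slots geometrically separated from preserved ones — can be sketched as a penalty on the cross inner products between old and new slot vectors. This is an illustrative reconstruction, not the authors' implementation; the slot matrices and the squared-Frobenius penalty form are assumptions.

```python
import numpy as np

def orthogonal_interference_penalty(preserved, expanded):
    """Penalty on cross inner products between preserved memory slots
    and newly expanded slots (squared Frobenius norm).

    preserved: (k_old, d) array of frozen slot vectors
    expanded:  (k_new, d) array of newly added slot vectors
    Returns 0 when every new slot is orthogonal to every old slot.
    """
    cross = preserved @ expanded.T       # (k_old, k_new) inner products
    return float(np.sum(cross ** 2))     # squared Frobenius norm

# Toy check: an orthogonal new slot incurs no penalty, an overlapping one does.
old = np.array([[1.0, 0.0, 0.0]])
new_orth = np.array([[0.0, 1.0, 0.0]])
new_overlap = np.array([[1.0, 0.0, 0.0]])
```

In practice such a term would be added to the task loss so that gradient descent pushes new slots away from the subspace spanned by preserved ones.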


A Geometric interpretation of regularization

Neural Information Processing Systems

[Table C: HCP dataset summary — HCP-Rest (Resting-state, Rest, 1093 subjects, 1200 timepoints) and HCP-Task (Working Memory: Task/Rest, 1087 subjects, 405 timepoints; Social: Mental/Random/Rest, 1053 subjects, 274 timepoints; Relational: Task/Rest, 1043 subjects, 232 timepoints; Motor: (L,R) Hand/Foot conditions)]


propagation on the DAVIS dataset (Table 1), in comparison to a SOTA self-supervised method [49] and the ImageNet pre-trained representation

Neural Information Processing Systems

Model                       J (Mean)
Self-supervised, SOTA [49]  43.0
ImageNet Representation     49.4
Self-supervised, Ours       57.7

The shared affinity matrix bridges these tasks and facilitates iterative improvements. These contributions are significant in the field of self-supervised learning. The contributions of this work are also demonstrated by our ablation study, i.e., Table 2 in the paper. We note that these components are novel and have not been explored in prior work. In the following, we address the other comments by reviewers.


Tokenize features, enhancing tables: the FT-TABPFN model for tabular classification

Liu, Quangao, Yang, Wei, Liang, Chen, Pang, Longlong, Zou, Zhuozhang

arXiv.org Artificial Intelligence

Traditional methods for tabular classification usually rely on supervised learning from scratch, which requires extensive training data to determine model parameters. However, a novel approach called Prior-Data Fitted Networks (TabPFN) has changed this paradigm. TabPFN uses a 12-layer transformer trained on large synthetic datasets to learn universal tabular representations. This method enables fast and accurate predictions on new tasks with a single forward pass and no need for additional training. Although TabPFN has been successful on small datasets, it generally shows weaker performance when dealing with categorical features. To overcome this limitation, we propose FT-TabPFN, an enhanced version of TabPFN that includes a novel Feature Tokenization layer to better handle categorical features. By fine-tuning it for downstream tasks, FT-TabPFN not only expands the functionality of the original model but also significantly improves its applicability and accuracy in tabular classification. Our full source code is available for community use and development.
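The feature tokenization idea described here — mapping each categorical value to a learned embedding so the transformer receives dense tokens rather than raw category codes — can be sketched roughly as follows. The embedding dimension, per-column lookup tables, and initialization are illustrative assumptions, not the FT-TabPFN implementation.

```python
import numpy as np

class FeatureTokenizer:
    """Toy feature tokenizer: one embedding table per categorical column.

    Each categorical value is replaced by a d-dimensional vector, so
    downstream layers see dense tokens instead of integer codes.
    """
    def __init__(self, cardinalities, d, seed=0):
        rng = np.random.default_rng(seed)
        # one (n_categories, d) embedding table per column
        self.tables = [rng.normal(size=(c, d)) for c in cardinalities]

    def __call__(self, x_cat):
        # x_cat: (n_rows, n_columns) integer category codes
        tokens = [table[x_cat[:, j]] for j, table in enumerate(self.tables)]
        return np.stack(tokens, axis=1)   # (n_rows, n_columns, d)

# Two categorical columns with 3 and 5 categories, embedded into 4 dims.
tok = FeatureTokenizer(cardinalities=[3, 5], d=4)
out = tok(np.array([[0, 4], [2, 1]]))
```

In a trainable setting the tables would be learned parameters updated by backpropagation rather than fixed random draws.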


Estimating Average Treatment Effects via Orthogonal Regularization

Hatt, Tobias, Feuerriegel, Stefan

arXiv.org Machine Learning

Decision-making often requires accurate estimation of treatment effects from observational data. This is challenging because the outcomes of alternative decisions are not observed and must be estimated. Previous methods estimate outcomes based on unconfoundedness but neglect any constraints that unconfoundedness imposes on the outcomes. In this paper, we propose a novel regularization framework for estimating average treatment effects that exploits unconfoundedness. To this end, we formalize unconfoundedness as an orthogonality constraint, which ensures that the outcomes are orthogonal to the treatment assignment. This orthogonality constraint is then incorporated into the loss function as a regularization term. Based on our regularization framework, we develop deep orthogonal networks for unconfounded treatments (DONUT), which learn outcomes that are orthogonal to the treatment assignment. Using a variety of benchmark datasets for estimating average treatment effects, we demonstrate that DONUT substantially outperforms the state-of-the-art.
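The orthogonality constraint described above — outcome residuals orthogonal to the propensity-adjusted treatment assignment — can be written as an empirical moment condition whose square is added to the loss as a regularizer. The exact functional form in DONUT may differ; this is a hedged sketch with assumed inputs (estimated propensity scores and outcome predictions).

```python
import numpy as np

def orthogonality_penalty(t, propensity, y, y_hat):
    """Squared empirical moment: mean[(t - e(x)) * (y - mu(x, t))]^2.

    Zero when outcome residuals are empirically orthogonal to the
    residualized treatment assignment, as unconfoundedness implies.
    """
    moment = np.mean((t - propensity) * (y - y_hat))
    return float(moment ** 2)

# Toy check: perfect outcome predictions leave no residual, so penalty is 0;
# residuals aligned with the treatment residual yield a positive penalty.
t = np.array([1.0, 0.0])
e = np.array([0.5, 0.5])
pen_zero = orthogonality_penalty(t, e, np.array([1.0, 2.0]), np.array([1.0, 2.0]))
pen_corr = orthogonality_penalty(t, e, np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Adding this term to the prediction loss nudges the learned outcome functions toward satisfying the orthogonality (unconfoundedness) constraint rather than enforcing it exactly.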


Neural Photo Editing with Introspective Adversarial Networks

Brock, Andrew, Lim, Theodore, Ritchie, J. M., Weston, Nick

arXiv.org Machine Learning

The increasingly photorealistic sample quality of generative image models suggests their feasibility in applications beyond image generation. We present the Neural Photo Editor, an interface that leverages the power of generative neural networks to make large, semantically coherent changes to existing images. To tackle the challenge of achieving accurate reconstructions without loss of feature quality, we introduce the Introspective Adversarial Network, a novel hybridization of the VAE and GAN. Our model efficiently captures long-range dependencies through use of a computational block based on weight-shared dilated convolutions, and improves generalization performance with Orthogonal Regularization, a novel weight regularization method. We validate our contributions on CelebA, SVHN, and CIFAR-100, and produce samples and reconstructions with high visual fidelity.
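Orthogonal Regularization as described here (encouraging weight matrices toward orthonormality) is commonly implemented as a penalty on the deviation of the Gram matrix from the identity. The variant and coefficient used in the paper may differ, so treat this as an illustrative sketch.

```python
import numpy as np

def orthogonal_regularization(W, lam=1e-4):
    """Penalty encouraging the rows of W to be orthonormal:
    lam * ||W W^T - I||_F^2. Zero for a row-orthonormal W.
    """
    k = W.shape[0]
    gram = W @ W.T
    return float(lam * np.sum((gram - np.eye(k)) ** 2))

# The identity is orthonormal, so it incurs zero penalty; a rank-deficient
# all-ones matrix does not.
pen_eye = orthogonal_regularization(np.eye(3))
pen_ones = orthogonal_regularization(np.ones((2, 2)))
```

In training, this term is summed over the (flattened) weight matrices of the convolutional layers and added to the main objective.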


Deep Multimodal Hashing with Orthogonal Regularization

Wang, Daixin (Tsinghua University) | Cui, Peng (Tsinghua University) | Ou, Mingdong (Tsinghua University) | Zhu, Wenwu (Tsinghua University)

AAAI Conferences

Hashing is an important method for performing efficient similarity search. With the explosive growth of multimodal data, learning hashing-based compact representations for multimodal data becomes highly non-trivial. Compared with shallow structured models, deep models are superior at capturing multimodal correlations due to their high nonlinearity. However, in order to make the learned representation more accurate and compact, reducing the redundant information in the multimodal representations and accommodating the different complexities of different modalities in deep models remain open problems. In this paper, we propose a novel deep multimodal hashing method, namely Deep Multimodal Hashing with Orthogonal Regularization (DMHOR), which fully exploits intra-modality and inter-modality correlations. In particular, to reduce redundant information, we impose an orthogonal regularizer on the weighting matrices of the model, and theoretically prove that the learned representation is guaranteed to be approximately orthogonal. Moreover, we find that a better representation can be attained with different numbers of layers for different modalities, due to their different complexities. Comprehensive experiments on WIKI and NUS-WIDE demonstrate a substantial gain of DMHOR compared with state-of-the-art methods.